# Wav2Vec2 fine-tuning

Wav2vec2 Ser English Finetuned

A Wav2Vec2 model fine-tuned to recognize six emotional states (sadness, anger, disgust, fear, happiness, neutral) in English speech, with an accuracy of 92.42%.

Audio Classification English

My Awesome Mind Model

An audio classification model fine-tuned from facebook/wav2vec2-base on the minds14 dataset

Audio Classification

Baby Cry Classification Finetuned Babycry V4

A baby cry classification model fine-tuned from wav2vec2-large-xlsr-53-english, achieving 81.5% accuracy

Audio Classification

W2v Speech Emotion Recognition

An English speech emotion recognition model fine-tuned from Wav2Vec2, capable of identifying six emotional states

Audio Classification English

Arabic Speech Syllables Recognition Using Wav2vec2

This is a Wav2Vec2-based Arabic syllable recognition model capable of identifying syllables in Modern Standard Arabic from speech.

Speech Recognition

Transformers Arabic

Wav2vec2 Ljspeech Gruut

A phoneme recognition model based on the Wav2Vec2 architecture, fine-tuned on the LJSpeech Phonemes dataset, used to convert speech into phoneme sequences

Speech Recognition

Transformers English

Wav2vec English Speech Emotion Recognition

English speech emotion recognition model fine-tuned from Wav2Vec 2.0, capable of recognizing 7 different emotions

Audio Classification

Malaya Speech Fine Tune Realcase 30 Jun Lm

This model is a fine-tuned version of malay-huggingface/wav2vec2-xls-r-300m-mixed on the uob_singlish dataset, mainly used for speech recognition tasks.

Speech Recognition

A French speech recognition model fine-tuned from facebook/wav2vec2-base-960h, with a word error rate of 1.0 on the evaluation set.

Speech Recognition

Model Facebookptbrlarge

A Brazilian Portuguese speech recognition model fine-tuned from Facebook's wav2vec2-large-xlsr-53-portuguese model on the Common Voice dataset

Speech Recognition

Wav2vec2 Base Common Voice 50p Persian Colab

A Persian speech recognition model fine-tuned from facebook/wav2vec2-base, supporting Persian speech-to-text tasks.

Speech Recognition

Wav2vec2 Xls R 300m Mr Cv9 With Lm

An automatic speech recognition model fine-tuned from Facebook's XLS-R-300M model on Marathi speech datasets

Speech Recognition

Transformers Other

English Filipino Wav2vec2 L Xls R Test 09

English-Filipino speech recognition model fine-tuned from jonatasgrosman/wav2vec2-large-xlsr-53-english, achieving a WER of 0.5750 on the evaluation set

Speech Recognition

English Filipino Wav2vec2 L Xls R Test 06

This model is a fine-tuned version of jonatasgrosman/wav2vec2-large-xlsr-53-english on the filipino_voice dataset, designed for English and Filipino speech recognition tasks.

Speech Recognition

Gram Vaani Harveen Chadda Fine Tuning

A speech recognition model fine-tuned from Harveenchadha/vakyansh-wav2vec2-hindi-him-4200, supporting Hindi speech-to-text tasks.

Speech Recognition

Automatic speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m on the Mozilla Common Voice Portuguese dataset

Speech Recognition

Transformers Other

Wav2vec2 Large Xlsr 53 Coraa Brazilian Portuguese Gain Normalization

This is a Wav2vec 2.0 model fine-tuned for Portuguese, trained on multiple Portuguese speech datasets including CORAA, CETUC, MLS, etc.

Speech Recognition

Transformers Other

Wav2vec2 Xlsr Multilingual 53 Fa

A multilingual speech recognition model based on the wav2vec 2.0 architecture, specifically fine-tuned for Persian, significantly reducing word error rate

Speech Recognition

Wav2vec2 Base Vietnamese

Vietnamese speech recognition model based on the Wav2Vec2 architecture, fine-tuned on the VLSP dataset; expects speech input sampled at 16 kHz

Speech Recognition

Transformers Other

Wav2vec2 Large Xlsr Turkish

An automatic speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the Turkish Common Voice dataset, achieving a test WER of 21.13%.

Speech Recognition Other

Wav2vec2 Large Xlsr Rm Sursilv

This is an automatic speech recognition model fine-tuned from the facebook/wav2vec2-large-xlsr-53 model, specifically designed for recognizing the Sursilvan dialect of Romansh.

Speech Recognition

Wav2vec2 Xls R 300m Lm Hebrew

Hebrew speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m with n-gram language model enhancement

Speech Recognition

Transformers Other

Wav2vec2 Large Xlsr Breton

A speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the Breton Common Voice dataset

Speech Recognition Other

Wav2vec2 Large Xlsr 53 Telugu

A Telugu speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the OpenSLR SLR66 dataset

Speech Recognition Other

Xls R Spanish Test

An automatic speech recognition (ASR) model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the Spanish Common Voice 7 dataset.

Speech Recognition

Transformers Spanish

Wav2vec2 Large Xlsr Greek 1

A Greek speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53; expects audio input sampled at 16 kHz.

Speech Recognition

Transformers Other

German speech recognition model fine-tuned from flozi00/wav2vec-xlsr-german

Speech Recognition

A speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m on the Common Voice 7.0 Vietnamese dataset and private datasets.

Speech Recognition

Transformers Other

Wav2vec2 Large XLSR 53 Assamese

Assamese automatic speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53, trained using the Common Voice dataset

Speech Recognition Other

Wav2vec2 Large Xlsr Greek 2

A speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the Greek Common Voice dataset, with the training set balanced using synthesized female voice data

Speech Recognition

Transformers Other

An automatic speech recognition (ASR) model fine-tuned from cahya/wav2vec2-base-turkish-artificial-cv on the Turkish COMMON_VOICE dataset

Speech Recognition

Transformers Other

This is a Wav2vec 2.0 model fine-tuned for Brazilian Portuguese, trained on multiple Brazilian Portuguese datasets, achieving a WER of 13.6 on the Common Voice test set.

Speech Recognition

Transformers Other

Wav2vec2 Xlsr Punjabi

An automatic speech recognition model fine-tuned for Punjabi using the Common Voice dataset, based on facebook/wav2vec2-large-xlsr-53

Speech Recognition

Automatic speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m on a Swedish dataset

Speech Recognition

Transformers Other

Bp Voxforge1 Xlsr

This is a Wav2Vec2 model fine-tuned for Brazilian Portuguese speech recognition tasks, trained on the VoxForge dataset.

Speech Recognition

Transformers Other

Wav2vec2 Large Voxrex Npsc

An automatic speech recognition model fine-tuned from KBLab/wav2vec2-large-voxrex on the NBAILAB/NPSC - 16K_MP3 dataset

Speech Recognition

Hausa automatic speech recognition model based on the wav2vec2-xls-r-300m architecture, fine-tuned on the Common Voice 8.0 Hausa dataset

Speech Recognition

Transformers Other
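
The WER figures quoted in the entries above appear on different scales (0.5750, 21.13%, 13.6, 1.0). As a rough aid for reading them, the sketch below computes word error rate in the usual way: word-level edit distance divided by the number of reference words. The function name `word_error_rate` and the sample sentences are illustrative assumptions, not taken from any listed model card, and individual cards may normalize text differently before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Hypothetical helper for illustration only; it splits on whitespace and
    applies no casing or punctuation normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution out of three reference words.
    print(word_error_rate("the cat sat", "the dog sat"))
```

On this fractional scale, 1.0 means the number of word errors equals the number of reference words, and values above 1.0 are possible when the hypothesis contains many inserted words. A figure such as 21.13% is the same quantity multiplied by 100.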
